In particular, we show that already for binary operators the black-box
complexity of LeadingOnes drops from $\Theta(n^2)$ for unary operators to
$O(n \log n)$. For OneMax, the $\Omega(n \log n)$ unary black-box complexity
drops to $O(n)$ in the binary case. For $k$-ary operators, $k \le n$, the
OneMax-complexity further decreases to $O(n/\log k)$.

1 Introduction 

When we analyze the optimization time of randomized search heuristics, we 
typically assume that the heuristic does not know anything about the ob- 
jective function apart from its membership in a large class of functions, e.g., 
linear or monotone functions. Thus, the function is typically considered to 
be given as a black-box, i.e., in order to optimize the function, the algorithm 
needs to query the function values of various search points. The algorithm 
may then use the information on the function values to create new search 
points. We call the minimum number of function evaluations needed for a
randomized search heuristic to optimize any function $f$ of a given function
class $\mathcal{F}$ the black-box complexity of $\mathcal{F}$. We may restrict the algorithms with
respect to how they create new search points from the information collected
in previous steps. Intuitively, the stronger the restrictions imposed on
which search points the algorithm can query next, the larger the black-box
complexity of the function class.

Black-box complexity for search heuristics was introduced in 2006 by 
Droste, Jansen, and Wegener [DJW06]. We call their model the unrestricted 
black-box model as it imposes few restrictions on how the algorithm may cre- 
ate new search points from the information at hand. This model was the first 
attempt towards creating a complexity theory for randomized search heuris- 
tics. However, the authors prove bounds that deviate from those known for 
well-studied search heuristics, such as random local search and evolutionary 
algorithms. For example, the well-studied function class OneMax has an 
unrestricted black-box complexity of $\Theta(n/\log n)$ whereas standard search



Model         Arity            OneMax                       LeadingOnes
unbiased      1                $\Theta(n \log n)$ [LW10]    $\Theta(n^2)$ [LW10]
unbiased      $1 < k \le n$    $O(n/\log k)$ (here)         $O(n \log n)$ (here)
unrestricted  n/a              $\Omega(n/\log n)$ [DJW06]   $\Omega(n)$ [DJW06]
                               $O(n/\log n)$ [AW09]

Table 1: Black-Box Complexity of OneMax and LeadingOnes. Note that 
upper bounds for the unbiased unary black-box complexity immediately 
carry over to higher arities. Similarly, lower bounds for the unrestricted 
black-box model also hold for the unbiased model. 
heuristics only achieve an $\Omega(n \log n)$ runtime. Similarly, the class LeadingOnes
has a linear unrestricted black-box complexity but we typically observe
an $\Omega(n^2)$ behavior for standard heuristics.

These gaps, among other reasons, motivated Lehre and Witt [LW10]
to propose an alternative model. In their unbiased black-box model the 
algorithm may only invoke a so-called unbiased variation operator to create 
new search points. A variation operator returns a new search point given one 
or more previous search points. Now, intuitively, the unbiasedness condition 
implies that the variation operator is symmetric with respect to the bit 
values and bit positions. Or, to be more precise, it must be invariant under 
Hamming-automorphisms. We give a formal definition of the two black-box
models in Section 2. 

Among other problem instances, Lehre and Witt analyze the unbiased
black-box complexity of the two function classes OneMax and LeadingOnes.
They show that the complexities of OneMax and LeadingOnes
match the above-mentioned $\Theta(n \log n)$ and $\Theta(n^2)$ bounds, respectively, if
we only allow unary operators. That is, if the variation operator may only use
the information from at most one previously queried search point, the unbiased
black-box complexity matches the runtime of the well-known (1+1)
Evolutionary Algorithm.

In their first work, Lehre and Witt give no results on the black-box 
complexity of higher arity models. A variation operator is said to be of arity 
k if it creates new search points by recombining up to k previously queried 
search points. We are interested in higher arity black-box models because 
they include commonly used search heuristics which are not covered by the 
unary operators. Among such heuristics are evolutionary algorithms that 
employ uniform crossover, particle swarm optimization [KE01], ant colony
optimization [DS04] and estimation of distribution algorithms [LL02]. 

Although search heuristics that employ higher arity operators are poorly 
understood from a theoretical point of view, there are some results proving 
that situations exist where higher arity is helpful. For example, Doerr, Klein, 
and Happ [DHK08] show that a concatenation operator reduces the runtime 
on the all-pairs shortest path problem. Refer to the same paper for further 
references. 

Extending the work of Lehre and Witt, we analyze higher arity black-box
complexities of OneMax and LeadingOnes. In particular, we show
that, surprisingly, the unbiased black-box complexity drops from $\Theta(n^2)$ in
the unary case to $O(n \log n)$ for LeadingOnes and from $\Theta(n \log n)$ to an
at most linear complexity for OneMax. As the bounds for unbiased unary
black-box complexities immediately carry over to all higher arity unbiased 
black-box complexities, we see that increasing the arity of the variation 
operators provably helps to decrease the complexity. We are optimistic that 
the ideas developed to prove the bounds can be further exploited to achieve 
reduced black-box complexities also for other function classes. 
In this work, we also prove that increasing the arity further does again
help. In particular, we show that for every $k \le n$, the unbiased $k$-ary black-box
complexity of OneMax can be bounded by $O(n/\log k)$. This bound is
optimal for $k = n$, because the unbiased black-box complexity can always
be bounded below by the unrestricted black-box complexity, which is known
to be $\Omega(n/\log n)$ for OneMax [DJW06].

Note that a comparison between the unrestricted black-box complexity 
and the unbiased black-box complexity of LeadingOnes cannot be derived 
that easily. The asymptotic linear unrestricted black-box complexity men- 
tioned above is only known to hold for a subclass of the class LeadingOnes 
considered in this work. 

Table 1 summarizes the results obtained in this paper, and provides a 
comparison with known results on black-box complexity of OneMax and 
LeadingOnes. 

2 Unrestricted and Unbiased Black-Box Complexities

In this section, we formally define the two black-box models by Droste, 
Jansen, and Wegener [DJW06], and Lehre and Witt [LW10]. We call the
first model the unrestricted black-box model, and the second model the un- 
biased black-box model. Each model specifies a class of algorithms. The 
black-box complexity of a function class is then defined with respect to the 
algorithms specified by the corresponding model. We start by describing the 
two models, then provide the corresponding definitions of black-box com- 
plexity. 

In both models, one is faced with a class of pseudo-Boolean functions $\mathcal{F}$
that is known to the algorithm. An adversary chooses a function $f$ from
this class. The function $f$ itself remains unknown to the algorithm. The
algorithm can only gain knowledge about the function $f$ by querying an
oracle for the function value of search points. The goal of the algorithm 
is to find a globally optimal search point for the function. Without loss of 
generality, we consider maximization as objective. The two models differ in 
the information available to the algorithm, and the search points that can 
be queried. 

Let us begin with some notation. Throughout this paper, we consider the
maximization of pseudo-Boolean functions $f : \{0,1\}^n \to \mathbb{R}$. In particular, $n$
will always denote the length of the bitstring to be optimized. For a bitstring
$x \in \{0,1\}^n$, we write $x = x_1 \cdots x_n$. For convenience, we denote the positive
integers by $\mathbb{N}$. For $k \in \mathbb{N}$, we use the notation $[k]$ as a shorthand for the set
$\{1, \ldots, k\}$. Analogously, we define $[0..k] := [k] \cup \{0\}$. Furthermore, let $S_k$
denote the set of all permutations of $[k]$. With slight abuse of notation, we
write $\sigma(x) := x_{\sigma(1)} \cdots x_{\sigma(n)}$ for $\sigma \in S_n$. Furthermore, the bitwise
exclusive-or is denoted by $\oplus$. For any bitstring $x$ we denote its complement by $\bar{x}$.
Finally, we use standard notation for asymptotic growth of functions (see,
e.g., [CLRS01]). In particular, we denote by $o_n(g(n))$ the set of all functions
$f$ that satisfy $\lim_{n \to \infty} f(n)/g(n) = 0$.

The unrestricted black-box model contains all algorithms which can be 
formalized as in Algorithm 1. A basic feature is that this scheme does 
not force any relationship between the search points of subsequent queries. 
Thus, this model contains a broad class of algorithms. 



Algorithm 1: Unrestricted Black-Box Algorithm

1 Choose a probability distribution $p^0$ on $\{0,1\}^n$.
2 Sample $x^0$ according to $p^0$ and query $f(x^0)$.
3 for $t = 1, 2, 3, \ldots$ until termination condition met do
4     Depending on $((x^0, f(x^0)), \ldots, (x^{t-1}, f(x^{t-1})))$, choose
5     a probability distribution $p^t$ on $\{0,1\}^n$.
6     Sample $x^t$ according to $p^t$, and query $f(x^t)$.



To exclude some algorithms whose behavior does not resemble that of
typical search heuristics, one can impose further restrictions. The unbiased
black-box model introduced in [LW10] restricts Algorithm 1 in two ways.
First, the decisions made by the algorithm only depend on the observed
fitness values, and not the actual search points. Second, the algorithm can
only query search points obtained by variation operators that are unbiased 
in the following sense. By imposing these two restrictions, the black-box 
complexity matches the runtime of popular search heuristics on example 
functions. 

Definition 1 (Unbiased $k$-ary variation operator [LW10]). Let $k \in \mathbb{N}$.
An unbiased $k$-ary distribution $D(\cdot \mid x^1, \ldots, x^k)$ is a conditional probability
distribution over $\{0,1\}^n$, such that for all bitstrings $y, z \in \{0,1\}^n$, and
each permutation $\sigma \in S_n$, the following two conditions hold.

(i) $D(y \mid x^1, \ldots, x^k) = D(y \oplus z \mid x^1 \oplus z, \ldots, x^k \oplus z)$,

(ii) $D(y \mid x^1, \ldots, x^k) = D(\sigma(y) \mid \sigma(x^1), \ldots, \sigma(x^k))$.

An unbiased $k$-ary variation operator $p$ is a $k$-ary operator which samples
its output according to an unbiased $k$-ary distribution.

The first condition in Definition 1 is referred to as $\oplus$-invariance, and
the second condition is referred to as permutation invariance. Note that
the combination of these two conditions can be characterized as invariance
under Hamming-automorphisms: $D(\cdot \mid x^1, \ldots, x^k)$ is unbiased if and only
if, for all $\alpha : \{0,1\}^n \to \{0,1\}^n$ preserving the Hamming distance and all
bitstrings $y$, $D(y \mid x^1, \ldots, x^k) = D(\alpha(y) \mid \alpha(x^1), \ldots, \alpha(x^k))$. We refer to
1-ary and 2-ary variation operators as unary and binary variation operators,
respectively. The unbiased $k$-ary black-box model contains all algorithms
which follow the scheme of Algorithm 2. While being a restriction of the 
old model, the unbiased model still captures the most widely studied search 
heuristics, including most evolutionary algorithms, simulated annealing and 
random local search. 
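
The two conditions of Definition 1 can be checked exhaustively for small $n$. The following Python sketch does this for three illustrative operators that are not taken from the paper's algorithms: a one-bit-flip mutation, uniform crossover, and a deliberately biased operator that forces the first bit to 1. All function names and the choice $n = 3$ are assumptions made for this illustration.

```python
from itertools import permutations, product

N = 3  # bitstring length; small enough for exhaustive checking
POINTS = list(product((0, 1), repeat=N))


def xor(a, b):
    """Bitwise exclusive-or of two bitstrings given as tuples."""
    return tuple(ai ^ bi for ai, bi in zip(a, b))


def permute(sigma, a):
    """sigma(a) := a_sigma(1) ... a_sigma(n), with 0-based indices."""
    return tuple(a[sigma[i]] for i in range(len(a)))


def flip_one_bit(y, x):
    """D(y | x) for the unary operator that flips one uniformly chosen bit."""
    return 1.0 / N if sum(xor(x, y)) == 1 else 0.0


def uniform_crossover(y, x1, x2):
    """D(y | x1, x2): every bit is copied from x1 or x2 with probability 1/2."""
    prob = 1.0
    for yi, ai, bi in zip(y, x1, x2):
        prob *= ((yi == ai) + (yi == bi)) / 2.0
    return prob


def force_first_bit(y, x):
    """D(y | x) for a biased operator that sets the first bit to 1."""
    return 1.0 if y == (1,) + x[1:] else 0.0


def is_unbiased(dist, arity):
    """Check conditions (i) and (ii) of Definition 1 exhaustively."""
    for parents in product(POINTS, repeat=arity):
        for y in POINTS:
            base = dist(y, *parents)
            for z in POINTS:  # (i) XOR-invariance
                shifted = [xor(p, z) for p in parents]
                if dist(xor(y, z), *shifted) != base:
                    return False
            for sigma in permutations(range(N)):  # (ii) permutation invariance
                moved = [permute(sigma, p) for p in parents]
                if dist(permute(sigma, y), *moved) != base:
                    return False
    return True
```

Under these assumptions, `is_unbiased(flip_one_bit, 1)` and `is_unbiased(uniform_crossover, 2)` hold, while `is_unbiased(force_first_bit, 1)` fails because the operator distinguishes a bit position and a bit value.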

Note that in line 5 of Algorithm 2, $y^1, \ldots, y^k$ don't necessarily have to be
the k immediately previously queried ones. That is, the algorithm is allowed 
to choose any k previously sampled search points. 

We now define black-box complexity formally. We will use query com- 
plexity as the cost model, where the algorithm is only charged for queries 
to the oracle, and all other computation is free. The runtime $T_{A,f}$ of a randomized
algorithm $A$ on a function $f \in \mathcal{F}$ is hence the expected number
of oracle queries until the optimal search point is queried for the first time.
The expectation is taken with respect to the random choices made by the 
algorithm. 

Definition 2 (Black-box complexity). The complexity of a class of pseudo-Boolean
functions $\mathcal{F}$ with respect to a class of algorithms $\mathcal{A}$, is defined as
$T_{\mathcal{A}}(\mathcal{F}) := \min_{A \in \mathcal{A}} \max_{f \in \mathcal{F}} T_{A,f}$.

The unrestricted black-box complexity is the complexity with respect to 
the algorithms covered by Algorithm 1. For any given $k \in \mathbb{N}$, the unbiased
k-ary black-box complexity is the complexity with respect to the algorithms 
covered by Algorithm 2. Furthermore, the unbiased *-ary black-box complex- 
ity is the complexity with respect to the algorithms covered by Algorithm 2, 
without limitation on the arity of the operators used. 

It is easy to see that every unbiased $k$-ary operator $p$ can be simulated
by an unbiased $(k+1)$-ary operator $p'$ defined as $p'(z \mid x^1, \ldots, x^k, x^{k+1}) :=
p(z \mid x^1, \ldots, x^k)$. Hence, the unbiased $k$-ary black-box complexity is an
upper bound for the unbiased $(k+1)$-ary black-box complexity. Similarly,
the set of unbiased black-box algorithms for any arity is contained in the set
of unrestricted black-box algorithms. Therefore, the unrestricted black-box
complexity is a lower bound for the unbiased $k$-ary black-box complexity
(for all $k \in \mathbb{N}$).

3 The Unbiased *-Ary Black-Box Complexity of 
OneMax 

In this section, we show that the unbiased black-box complexity of OneMax
is $\Theta(n/\log n)$ with a leading constant between one and two. We begin with
the formal definition of the function class OneMax$_n$. We will omit the
subscript "$n$" if the size of the input is clear from the context.
Algorithm 2: Unbiased $k$-ary Black-Box Algorithm

1 Sample $x^0$ uniformly at random from $\{0,1\}^n$ and query $f(x^0)$.
2 for $t = 1, 2, 3, \ldots$ until termination condition met do
3     Depending on $(f(x^0), \ldots, f(x^{t-1}))$, choose
4     an unbiased $k$-ary variation operator $p^t$, and
5     $k$ previously queried search points $y^1, \ldots, y^k$.
6     Sample $x^t$ according to $p^t(y^1, \ldots, y^k)$, and query $f(x^t)$.



Definition 3 (OneMax). For all $n \in \mathbb{N}$ and each $z \in \{0,1\}^n$ we define
$\mathrm{OM}_z : \{0,1\}^n \to \mathbb{N}, x \mapsto |\{j \in [n] \mid x_j = z_j\}|$.^1 The class OneMax$_n$ is
defined as OneMax$_n := \{\mathrm{OM}_z \mid z \in \{0,1\}^n\}$.
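
As a concrete illustration of Definition 3, the following Python sketch builds $\mathrm{OM}_z$ for a given target $z$. The representation of bitstrings as tuples and the names `om` and `hamming` are assumptions of this sketch.

```python
def om(z):
    """Return the function OM_z, which counts the positions where x agrees with z."""
    def f(x):
        return sum(1 for xj, zj in zip(x, z) if xj == zj)
    return f


def hamming(x, y):
    """Hamming distance between two bitstrings of equal length."""
    return sum(1 for xj, yj in zip(x, y) if xj != yj)


z = (1, 0, 1, 1, 0)
f = om(z)
x = (1, 1, 1, 0, 0)
```

Here `f(x)` equals `len(z) - hamming(x, z)`, and `f(z)` attains the maximum value `len(z)`.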

To motivate the definitions, let us briefly mention that we do not further
consider the optimization of specific functions such as $\mathrm{OM}_{(1,\ldots,1)}$ since they
would have an unrestricted black-box complexity of 1: The algorithm asking
for the bitstring $(1, \ldots, 1)$ in the first step easily optimizes the function in
just one step. Thus, we need to consider some generalizations of these
functions. For the unrestricted black-box model, we already have a lower
bound by Droste, Jansen, and Wegener [DJW06]. For the same model, an
algorithm which matches this bound in order of magnitude is given by Anil 
and Wiegand in [AW09]. 

Theorem 4. The unrestricted black-box complexity of OneMax$_n$ is
$\Theta(n/\log n)$. Moreover, the leading constant is at least 1.

As already mentioned, the lower bound on the complexity of OneMax$_n$
in the unrestricted black-box model from Theorem 4 directly carries over to
the stricter unbiased black-box model.

Corollary 5. The unbiased $*$-ary black-box complexity of OneMax$_n$ is at
least $n/\log n$.

Moreover, an upper bound on the complexity of OneMax in the unbi- 
ased black-box model can be derived using the same algorithmic approach 
as given for the unrestricted black-box model (compare [AW09] and Theo- 
rem 4). 

Theorem 6. The unbiased $*$-ary black-box complexity of OneMax$_n$ is at
most $(1 + o_n(1)) \frac{2n}{\log n}$.

In return, this theorem also applies to the unrestricted black-box model
and refines Theorem 4 by explicitly bounding the leading constant of the
unrestricted black-box complexity for OneMax by a factor of two of the
lower bound. The result in Theorem 6 is based on Algorithm 3. This
algorithm makes use of the operator uniformSample that samples a bitstring
uniformly at random, which clearly is an unbiased (0-ary) variation
operator. Further, it makes use of another family of operators:
chooseConsistent$_{u^1,\ldots,u^t}(x^1, \ldots, x^t)$ chooses a $z \in \{0,1\}^n$ uniformly at random
such that, for all $i \in [t]$, $\mathrm{OM}_z(x^i) = u^i$ (if there exists one, and any
bitstring uniformly at random otherwise). It is easy to see that this is a
family of unbiased variation operators.

^1 Intuitively, $\mathrm{OM}_z$ is the function of $n$ minus the Hamming distance to $z$.



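Algorithm 3: Optimizing OneMax with unbiased variation operators.

1 input Integer $n \in \mathbb{N}$ and function $f \in$ OneMax$_n$;
2 initialization $t \leftarrow \left\lceil \left(1 + \frac{4 \log\log n}{\log n}\right) \frac{2n}{\log n} \right\rceil$;
3 repeat
4     foreach $i \in [t]$ do
5         $x^i \leftarrow$ uniformSample();
6     $w \leftarrow$ chooseConsistent$_{f(x^1),\ldots,f(x^t)}(x^1, \ldots, x^t)$;
7 until $f(w) = n$;
8 output $w$;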
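
For small $n$, Algorithm 3 can be sketched in Python by implementing chooseConsistent through brute-force enumeration of all $2^n$ candidate targets. This is only an illustration under stated assumptions: logarithms are taken to base 2, $n \ge 2$, and the names `agreements`, `sample_count`, and `optimize_onemax` are introduced here for the sketch.

```python
import math
import random
from itertools import product


def agreements(z, x):
    """OM_z(x): number of positions in which x and z agree."""
    return sum(1 for zj, xj in zip(z, x) if zj == xj)


def sample_count(n):
    """t from line 2 of Algorithm 3 (base-2 logarithms assumed, n >= 2)."""
    log_n = math.log2(n)
    return math.ceil((1 + 4 * math.log2(log_n) / log_n) * 2 * n / log_n)


def optimize_onemax(f, n, rng):
    """Run Algorithm 3 on f and return (optimum, number of queries)."""
    t = sample_count(n)
    all_points = list(product((0, 1), repeat=n))
    queries = 0
    while True:
        # Lines 4-5: draw t uniform samples and query their fitness.
        xs = [tuple(rng.randint(0, 1) for _ in range(n)) for _ in range(t)]
        us = [f(x) for x in xs]
        queries += t
        # Line 6: brute-force chooseConsistent over all candidate targets.
        consistent = [z for z in all_points
                      if all(agreements(z, x) == u for x, u in zip(xs, us))]
        # For f in OneMax_n the hidden target is always consistent; the
        # fallback mirrors the "any bitstring otherwise" clause.
        w = rng.choice(consistent or all_points)
        queries += 1
        if f(w) == n:
            return w, queries


target = (1, 0, 1, 1, 0, 0, 1, 0)
result, used = optimize_onemax(lambda x: agreements(target, x),
                               len(target), random.Random(1))
```

Each pass through the loop costs $t + 1$ queries, so `used` is a multiple of `sample_count(8) + 1`.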



An upper bound of $(1 + o_n(1)) 2n/\log n$ for the expected runtime of
Algorithm 3 follows directly from the following theorem, which implies that
the number of repetitions of steps 4 to 6 follows a geometric distribution
with success probability $1 - o_n(1)$. This proves Theorem 6.

Theorem 7. Let $n$ be sufficiently large (i.e., let $n \ge N_0$ for some fixed
constant $N_0 \in \mathbb{N}$). Let $z \in \{0,1\}^n$ and let $X$ be a set of $t \ge \left(1 + \frac{4 \log\log n}{\log n}\right) \frac{2n}{\log n}$
samples chosen from $\{0,1\}^n$ uniformly at random and mutually independent.
Then the probability that there exists an element $y \in \{0,1\}^n$ such that $y \ne z$
and $\mathrm{OM}_y(x) = \mathrm{OM}_z(x)$ for all $x \in X$ is bounded from above by $2^{-t/4}$.

The previous theorem is a refinement of Theorem 1 in [AW09], and 
its proof follows the proof of Theorem 1 in [AW09], clarifying some
inconsistencies^2 in that proof. To show Theorem 7, we first give a bound on a
combinatorial quantity used later in its proof (compare Lemma 1 in [AW09]).

Proposition 8. For sufficiently large $n$,
$$t \ge \left(1 + \frac{4 \log\log n}{\log n}\right) \frac{2n}{\log n},$$
and even $d \in \{2, \ldots, n\}$, it holds that
$$\binom{n}{d} \left( \binom{d}{d/2} 2^{-d} \right)^t \le 2^{-3t/4}. \qquad (1)$$

^2 For example, in the proof of Lemma 1 in [AW09] the following claim is made. Let $d(n)$
be a monotone increasing sequence that tends to infinity. Then for sufficiently large $n$ the
sequence $h_d(n)$ defined there is bounded away from 1 by a constant $b > 1$. Clearly,
this is not the case. For example, for $d(n) = \log n$, the sequence $h_{\log}(n)$ converges to 1.
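Proof. By Stirling's formula, we have $\binom{d}{d/2} \le \left(\frac{\pi d}{2}\right)^{-1/2} 2^d$. Therefore,
$$\binom{n}{d} \left( \binom{d}{d/2} 2^{-d} \right)^t \le \binom{n}{d} \left(\frac{\pi d}{2}\right)^{-t/2}. \qquad (2)$$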

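We distinguish two cases. First, we consider the case $2 \le d < n/(\log n)^3$.
By Stirling's formula, it holds that $\binom{n}{d} \le \left(\frac{en}{d}\right)^d$. Thus, we get from (2) that
$$\binom{n}{d} \left( \binom{d}{d/2} 2^{-d} \right)^t \le \left(\frac{en}{d}\right)^d \left(\frac{\pi d}{2}\right)^{-t/2} = 2^{\left(\frac{2d}{t} \log\left(\frac{en}{d}\right) - \log\left(\frac{\pi d}{2}\right)\right) \frac{t}{2}}. \qquad (3)$$
We bound $d$ by its minimal value 2 and maximal value $n/(\log n)^3$, and $t$
by $2n/\log n$ to obtain
$$\frac{2d}{t} \log\frac{en}{d} - \log\frac{\pi d}{2} \le \frac{1}{(\log n)^2} \log\frac{en}{2} - \log \pi.$$
Since the first term on the right hand side converges to 0 and since
$\log \pi > 3/2$, the exponent in (3) can be bounded from above by $-3t/4$, if $n$
is sufficiently large. Thus, we obtain inequality (1) for $2 \le d < n/(\log n)^3$.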

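Next, we consider the case $n/(\log n)^3 \le d \le n$. By the binomial formula,
it holds that $\binom{n}{d} \le 2^n$. Thus,
$$\binom{n}{d} \left(\frac{\pi d}{2}\right)^{-t/2} \le 2^n \left(\frac{\pi d}{2}\right)^{-t/2} = 2^{\left(\frac{2n}{t} - \log\frac{\pi d}{2}\right) \frac{t}{2}}. \qquad (4)$$
We bound $\pi d/2$ by $n/(\log n)^3$ and $t$ by $\left(1 + \frac{4 \log\log n}{\log n}\right) \frac{2n}{\log n}$ to obtain
$$\begin{aligned}
\frac{2n}{t} - \log\frac{\pi d}{2}
&\le \frac{\log n}{1 + \frac{4 \log\log n}{\log n}} - \log\left(n/(\log n)^3\right) \\
&= \frac{\log n}{1 + \frac{4 \log\log n}{\log n}} - \log n + 3 \log\log n \\
&= \frac{3 \log\log n + \frac{4 \log\log n}{\log n} \left(-\log n + 3 \log\log n\right)}{1 + \frac{4 \log\log n}{\log n}} \\
&= -\frac{\log n - 12 \log\log n}{\log n + 4 \log\log n} \log\log n.
\end{aligned}$$
Again, for sufficiently large $n$ the right hand side becomes smaller than $-3/2$.
We combine the previous inequality with inequalities (2) and (4) to show
inequality (1) for $n/(\log n)^3 \le d \le n$. □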
With the previous proposition at hand, we finally prove Theorem 7.



Proof of Theorem 7. Let $n$ be sufficiently large, $z \in \{0,1\}^n$, and $X$ a set
of $t \ge \left(1 + \frac{4 \log\log n}{\log n}\right) \frac{2n}{\log n}$ samples chosen from $\{0,1\}^n$ uniformly at random
and mutually independent.

For $d \in [n]$, let $A_d := \{y \in \{0,1\}^n \mid n - \mathrm{OM}_z(y) = d\}$ be the set of
all points with Hamming distance $d$ from $z$. Let $d \in [n]$ and $y \in A_d$. We
say the point $y$ is consistent with $x$ if $\mathrm{OM}_y(x) = \mathrm{OM}_z(x)$ holds. Intuitively,
this means that $\mathrm{OM}_y$ is a possible target function, given the fitness of $x$.
It is easy to see that $y$ is consistent with $x$ if and only if $x$ and $y$ (and
therefore $x$ and $z$) differ in exactly half of the $d$ bits that differ between $y$
and $z$. Therefore, $y$ is never consistent with $x$ if $d$ is odd and the probability
that $y$ is consistent with $x$ is $\binom{d}{d/2} 2^{-d}$ if $d$ is even.
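
The consistency probability stated above can be confirmed by enumeration for a small instance. The following Python sketch counts, for a fixed $z$ and a point $y$ at Hamming distance $d$ from $z$, the fraction of all $x \in \{0,1\}^n$ with $\mathrm{OM}_y(x) = \mathrm{OM}_z(x)$; the choice $n = 6$, $z = 0^n$, and the function names are assumptions of this sketch.

```python
from itertools import product
from math import comb


def agreements(z, x):
    """OM_z(x): number of positions in which x and z agree."""
    return sum(1 for zj, xj in zip(z, x) if zj == xj)


def consistency_probability(y, z):
    """Exact probability that a uniform x satisfies OM_y(x) = OM_z(x)."""
    n = len(z)
    hits = sum(1 for x in product((0, 1), repeat=n)
               if agreements(y, x) == agreements(z, x))
    return hits / 2 ** n


n = 6
z = (0,) * n
observed = {}
for d in range(1, n + 1):
    y = (1,) * d + (0,) * (n - d)  # a point at Hamming distance d from z
    observed[d] = consistency_probability(y, z)
```

For even $d$ the entry `observed[d]` equals `comb(d, d // 2) / 2 ** d`, and for odd $d$ it is zero.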

Let $p$ be the probability that there exists a point $y \in \{0,1\}^n \setminus \{z\}$ such
that $y$ is consistent with all $x \in X$. Then,
$$p = \Pr\Bigl( \bigcup_{y \in \{0,1\}^n \setminus \{z\}} \bigcap_{x \in X} \text{"$y$ is consistent with $x$"} \Bigr).$$
Thus, by the union bound, we have
$$p \le \sum_{y \in \{0,1\}^n \setminus \{z\}} \Pr\Bigl( \bigcap_{x \in X} \text{"$y$ is consistent with $x$"} \Bigr).$$
Since, for a fixed $y$, the events "$y$ is consistent with $x$" are mutually
independent for all $x \in X$, it holds that
$$p \le \sum_{d=1}^{n} \sum_{y \in A_d} \prod_{x \in X} \Pr(\text{"$y$ is consistent with $x$"}).$$
We substitute the probability that a fixed $y \in \{0,1\}^n$ is consistent with a
randomly chosen $x \in \{0,1\}^n$ as given above. Using $|A_d| = \binom{n}{d}$, we obtain
$$p \le \sum_{d \in \{1,\ldots,n\} :\, d \text{ even}} \binom{n}{d} \left( \binom{d}{d/2} 2^{-d} \right)^t.$$
Finally, we apply Proposition 8 and have $p \le n 2^{-3t/4}$ which concludes the
proof since $n \le 2^{t/2}$ for sufficiently large $n$ (as $t$ is in $\Omega(n/\log n)$). □
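
To get a feeling for the size of the final sum, it can be evaluated numerically for a moderate $n$. The Python sketch below does this with exact integer arithmetic before converting to floating point; the base-2 logarithm, the choice $n = 64$, and the function names are assumptions of this sketch, and a single instance of course says nothing about the asymptotic statement.

```python
import math


def sample_count(n):
    """Smallest integer t allowed by Theorem 7 (base-2 logarithms assumed)."""
    log_n = math.log2(n)
    return math.ceil((1 + 4 * math.log2(log_n) / log_n) * 2 * n / log_n)


def log2_union_bound(n, t):
    """log2 of the sum over even d of C(n,d) * (C(d,d/2) * 2^-d)^t."""
    total = 0.0
    for d in range(2, n + 1, 2):
        numerator = math.comb(n, d) * math.comb(d, d // 2) ** t
        total += numerator / 2 ** (d * t)  # exact integers, then one division
    return math.log2(total)


n = 64
t = sample_count(n)
value = log2_union_bound(n, t)
```

For this instance `value` lies below both `math.log2(n) - 3 * t / 4` and `-t / 4`, in line with the two bounds used at the end of the proof.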

4 The Unbiased k-Ary Black-Box Complexity of 
OneMax 

In this section, we show that higher arity indeed enables the construction of 
faster black-box algorithms. In particular, we show the following result. 
Theorem 9. For every $k \in [n]$ with $k \ge 2$, the unbiased $k$-ary black-box
complexity of OneMax$_n$ is at most linear in $n$. Moreover, it is at most
$(1 + o_k(1)) 2n/\log k$.

This result is surprising, since in [LW10], Lehre and Witt prove that the
unbiased unary black-box complexity of the class of all functions $f$ with a
unique global optimum is $\Omega(n \log n)$. Thus, we gain a factor of $\log n$ when
switching from unary to binary variation operators.

To prove Theorem 9, we introduce two different algorithms interesting 
on their own. Both algorithms share the idea to track which bits have 
already been optimized. That way we can avoid flipping them again in 
future iterations of the algorithm. 

The first algorithm proves that the unbiased black-box complexity
of OneMax$_n$ is at most linear in $n$ if the arity is at least two. For the
general case, with $k \ge 3$, we give a different algorithm that provides
asymptotically better bounds for $k$ growing in $n$. We use the idea that the whole
bitstring can be divided into smaller substrings, and subsequently those can
be independently optimized. We show that this is possible, and together
with Theorem 6, this yields the above bound for OneMax$_n$ in the $k$-ary
case for $k \ge 3$.

4.1 The Binary Case 

We begin with the binary case. We use the three unbiased variation
operators uniformSample (as described in Section 3), complement
and flipOneWhereDifferent defined as follows. The unary operator
complement($x$) returns the bitwise complement of $x$. The binary operator
flipOneWhereDifferent($x, y$) returns a copy of $x$, where one of the bits
that differ in $x$ and $y$ is chosen uniformly at random and then flipped. It
is easy to see that complement and flipOneWhereDifferent are unbiased
variation operators.

Lemma 10. With exponentially small probability of failure, the optimization
time of Algorithm 4 on the class OneMax$_n$ is at most $(1+\varepsilon)2n$, for all $\varepsilon > 0$.
The algorithm only involves binary operators.

Proof. We first prove that the algorithm is correct. Assume that the instance
has optimum $z$, for some $z \in \{0,1\}^n$. We show that the following invariant
is satisfied in the beginning of every iteration of the main loop (steps 4-12):
for all $i \in [n]$, if $x_i = y_i$, then $x_i = z_i$. In other words, the positions where
$x$ and $y$ have the same bit value are optimized. The invariant clearly holds
in the first iteration, as $x$ and $y$ differ in all bit positions. A bit flip is only
accepted if the fitness value is strictly higher, an event which occurs with
positive probability. Hence, if the invariant holds in the current iteration,
then it also holds in the following iteration. By induction, the invariant
property now holds in every iteration of the main loop.
Algorithm 4: Optimizing OneMax with unbiased binary variation operators.

1 input Integer $n \in \mathbb{N}$ and function $f \in$ OneMax$_n$;
2 initialization $x \leftarrow$ uniformSample();
3 $y \leftarrow$ complement($x$);
4 repeat
5     Choose $b \in \{0,1\}$ uniformly at random;
6     if $b = 1$ then
7         $x' \leftarrow$ flipOneWhereDifferent($x, y$);
8         if $f(x') > f(x)$ then $x \leftarrow x'$;
9     else
10        $y' \leftarrow$ flipOneWhereDifferent($y, x$);
11        if $f(y') > f(y)$ then $y \leftarrow y'$;
12 until $f(x) = n$;
13 output $x$;
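
Algorithm 4 translates almost line by line into Python. The sketch below is one possible rendering under some assumptions of ours: bitstrings are tuples, fitness values of the current $x$ and $y$ are cached instead of being queried again, and the query counter includes the two initial evaluations.

```python
import random


def flip_one_where_different(x, y, rng):
    """Return a copy of x with one uniformly chosen bit among those differing from y flipped."""
    diff = [i for i in range(len(x)) if x[i] != y[i]]
    i = rng.choice(diff)
    out = list(x)
    out[i] ^= 1
    return tuple(out)


def optimize_binary(f, n, rng):
    """Run Algorithm 4 on f and return (optimum, number of queries)."""
    x = tuple(rng.randint(0, 1) for _ in range(n))  # uniformSample()
    y = tuple(1 - b for b in x)                     # complement(x)
    fx, fy = f(x), f(y)
    queries = 2
    while fx != n:
        if rng.random() < 0.5:                      # b = 1
            candidate = flip_one_where_different(x, y, rng)
            value = f(candidate)
            queries += 1
            if value > fx:
                x, fx = candidate, value
        else:                                       # b = 0
            candidate = flip_one_where_different(y, x, rng)
            value = f(candidate)
            queries += 1
            if value > fy:
                y, fy = candidate, value
    return x, queries


target = (0, 1, 1, 0, 1, 0, 0, 1, 1, 1)
onemax = lambda x: sum(1 for a, b in zip(x, target) if a == b)
result, used = optimize_binary(onemax, len(target), random.Random(7))
```

By the invariant from the proof of Lemma 10, $x$ and $y$ can only coincide once $x$ is optimal, so the list of differing positions is never empty inside the loop.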



We then analyze the runtime of the algorithm. Let $T$ be the number
of iterations needed until $n$ bit positions have been optimized. Due to the
invariant property, this is the same as the time needed to reduce the Hamming
distance between $x$ and $y$ from $n$ to 0. An iteration is successful, i.e.,
the Hamming distance is reduced by 1, with probability 1/2 independently
of previous trials. The random variable $T$ is therefore negative binomially
distributed with parameters $n$ and 1/2. It can be related to a binomially
distributed random variable $X$ with parameters $2n(1+\varepsilon)$ and 1/2 by
$\Pr(T > 2n(1+\varepsilon)) = \Pr(X < n)$. Finally, by applying a Chernoff bound
with respect to $X$, we obtain $\Pr(T > 2n(1+\varepsilon)) \le \exp(-\varepsilon^2 n/(2(1+\varepsilon)))$. □
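
The Chernoff estimate can be compared with the exact binomial tail for concrete parameters. The following Python sketch does so for $n = 50$ and $\varepsilon = 0.5$; these values and the function names are arbitrary choices made for illustration.

```python
from math import comb, exp


def tail_probability(n, trials):
    """Exact Pr(X < n) for X binomially distributed with parameters trials and 1/2."""
    return sum(comb(trials, j) for j in range(n)) / 2 ** trials


def chernoff_bound(n, eps):
    """The bound exp(-eps^2 n / (2 (1 + eps))) from the proof of Lemma 10."""
    return exp(-eps ** 2 * n / (2 * (1 + eps)))


n, eps = 50, 0.5
trials = int(2 * n * (1 + eps))  # 2n(1 + eps), integral for these values
exact = tail_probability(n, trials)
bound = chernoff_bound(n, eps)
```

Here `exact` is positive and does not exceed `bound`, as the Chernoff inequality guarantees.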

It is easy to see that Algorithm 4 yields the same bounds on the class of 
monotone functions, which is defined as follows. 

Definition 11 (Monotone functions). Let $n \in \mathbb{N}$ and let $z \in \{0,1\}^n$. A
function $f : \{0,1\}^n \to \mathbb{R}$ is said to be monotone with respect to $z$ if for all
$y, y' \in \{0,1\}^n$ with $\{i \in [n] \mid y_i = z_i\} \subsetneq \{i \in [n] \mid y'_i = z_i\}$ it holds that
$f(y) < f(y')$. The class Monotone$_n$ contains all such functions that are
monotone with respect to some $z \in \{0,1\}^n$.

Now, let $f$ be a monotone function with respect to $z$ and let $y$ and
$y'$ be two bitstrings which differ only in the $i$-th position. Assume that
$y_i \ne z_i$ and $y'_i = z_i$. It follows from the monotonicity of $f$ that $f(y) < f(y')$.
Consequently, Algorithm 4 optimizes $f$ as fast as any function in OneMax$_n$.

Corollary 12. The unbiased binary black-box complexity of Monotone$_n$
is $O(n)$.

Note that Monotone$_n$ strictly includes the class of linear functions with
non-zero weights.

4.2 Proof of Theorem 9 for Arity $k \ge 3$

For the case of arity $k \ge 3$ we analyze the following Algorithm 5 and show
that its optimization time on OneMax$_n$ is at most $(1 + o_k(1)) 2n/\log k$.
Informally, the algorithm splits the bitstring into blocks of length $k$. The
$n/k$ blocks are then optimized separately using a variant of Algorithm 3,
each in expected time $(1 + o_k(1)) 2k/\log k$.

In detail, Algorithm 5 maintains its state using three bitstrings $x$, $y$
and $z$. Bitstring $x$ represents the preliminary solution. The positions in
which bitstrings $x$ and $y$ differ represent the remaining blocks to be optimized,
and the positions in which bitstrings $y$ and $z$ differ represent the
current block to be optimized. Due to permutation invariance, it can be
assumed without loss of generality that the bitstrings can be expressed by
$x = \bar\alpha\bar\beta\gamma$, $y = \alpha\beta\gamma$, and $z = \bar\alpha\beta\gamma$, see Step 6 of Algorithm 5. The algorithm
uses an operator called flipKWhereDifferent$_\ell$ to select a new block
of size $\ell$ to optimize. The selected block is optimized by calling the subroutine
optimizeSelected$_{n,\ell}$, and the optimized block is inserted into the
preliminary solution using the operator update.

The operators in Algorithm 5 are defined as follows. The operator
flipKWhereDifferent$_k(x, y)$ generates the bitstring $z$. This is done by making
a copy of $y$, choosing $\ell := \min\{k, H(x, y)\}$ bit positions for which $x$ and $y$
differ uniformly at random, and flipping them. The operator update($a, b, c$)
returns a bitstring $a'$ which in each position $i \in [n]$ independently, takes the
value $a'_i = b_i$ if $a_i = c_i$, and $a'_i = a_i$ otherwise. Clearly, both these operators
are unbiased. The operators uniformSample and complement have been
defined in previous sections.
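
The two operators just defined are short enough to state directly in code. The following Python sketch represents bitstrings as tuples; the function names and the small example strings are assumptions of this sketch.

```python
import random


def hamming(x, y):
    """Hamming distance H(x, y)."""
    return sum(1 for a, b in zip(x, y) if a != b)


def flip_k_where_different(x, y, k, rng):
    """Copy y and flip min(k, H(x, y)) uniformly chosen positions in which x and y differ."""
    diff = [i for i in range(len(x)) if x[i] != y[i]]
    z = list(y)
    for i in rng.sample(diff, min(k, len(diff))):
        z[i] ^= 1
    return tuple(z)


def update(a, b, c):
    """Take b_i where a_i = c_i and keep a_i otherwise."""
    return tuple(bi if ai == ci else ai for ai, bi, ci in zip(a, b, c))


x = (0, 0, 0, 0, 0, 1)
y = (1, 1, 1, 1, 1, 1)
z = flip_k_where_different(x, y, 2, random.Random(0))
merged = update((0, 0, 1), (1, 1, 1), (0, 1, 1))
```

In the example, $z$ differs from $y$ in exactly two of the five positions where $x$ and $y$ differ, and `merged` is `(1, 0, 1)`: the first and third positions are taken from $b$ because $a$ and $c$ agree there.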

It remains to define the subroutine optimizeSelected$_{n,k}$. This subroutine
is a variant of Algorithm 3 that only optimizes a selected block of bit
positions, and leaves the other blocks unchanged. The block is
represented by the bit positions in which bitstrings $y$ and $z$ differ. Due to
permutation-invariance, we assume that they are of the form $y = \alpha\sigma$ and
$z = \bar\alpha\sigma$, for some bitstrings $\alpha \in \{0,1\}^k$, and $\sigma \in \{0,1\}^{n-k}$. The operator
uniformSample in Algorithm 3 is replaced by a 2-ary operator defined
by: randomWhereDifferent($x, y$) chooses $z$, where for each $i \in [n]$, the
value of bit $z_i$ is $x_i$ or $y_i$ with equal probability. Note that this operator is
the same as the standard uniform crossover operator. The operator family
chooseConsistent in Algorithm 3 is replaced by a family of $(r+2)$-ary operators
defined by: chooseConsistentSelected$_{u^1,\ldots,u^r}(x^1\sigma, \ldots, x^r\sigma, \alpha\sigma, \bar\alpha\sigma)$
chooses $z\sigma$, where the prefix $z$ is sampled uniformly at random from the
set $Z_{u,x} = \{z \in \{0,1\}^k \mid \forall i \in [r] : \mathrm{OM}_z(x^i_1 \cdots x^i_k) = u^i\}$. If the set $Z_{u,x}$
is empty, then $z$ is sampled uniformly at random among all bitstrings of
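Algorithm 5: Optimizing OneMax with unbiased $k$-ary variation operators,
for $k \ge 3$.

1 input Integers $n, k \in \mathbb{N}$, and function $f \in$ OneMax$_n$;
2 initialization $x^1 \leftarrow$ uniformSample(), $y^1 \leftarrow$ complement($x^1$), and $\tau \leftarrow \lceil n/k \rceil$;
3 foreach $t \in [\tau]$ do
4     $\ell(t) \leftarrow \min\{k, n - k(t-1)\}$;
5     $z \leftarrow$ flipKWhereDifferent$_{\ell(t)}(x^t, y^t)$;
6     Assume that $x^t = \bar\alpha\bar\beta\gamma$, $y^t = \alpha\beta\gamma$, and $z = \bar\alpha\beta\gamma$;
7     $w^t\beta\gamma \leftarrow$ optimizeSelected$_{n,\ell(t)}(\alpha\beta\gamma, \bar\alpha\beta\gamma)$;
8     $w^t\bar\beta\gamma \leftarrow$ update$(\bar\alpha\bar\beta\gamma, w^t\beta\gamma, \bar\alpha\beta\gamma)$;
9     $x^{t+1} \leftarrow w^t\bar\beta\gamma$ and $y^{t+1} \leftarrow w^t\beta\gamma$;
10 output $x^{\tau+1}$;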



length $k$. Informally, the set $Z_{u,x}$ corresponds to the subset of functions in
OneMax$_k$ that are consistent with the function values $u^1, u^2, \ldots, u^r$ on the
inputs $x^1, x^2, \ldots, x^r$. It is easy to see that this operator is unbiased.

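Algorithm 6: optimizeSelected used in Algorithm 5.

1 input Integers $n, k \in \mathbb{N}$, and bitstrings $\alpha\sigma$ and $\bar\alpha\sigma$, where $\alpha \in \{0,1\}^k$
  and $\sigma \in \{0,1\}^{n-k}$;
2 initialization $r \leftarrow \min\left\{k - 2, \left\lceil \left(1 + \frac{4 \log\log k}{\log k}\right) \frac{2k}{\log k} \right\rceil\right\}$,
  $f_\sigma \leftarrow \frac{f(\alpha\sigma) + f(\bar\alpha\sigma) - k}{2}$;
3 repeat
4     foreach $i \in [r]$ do
5         $x^i\sigma \leftarrow$ randomWhereDifferent($\alpha\sigma, \bar\alpha\sigma$);
6     $w\sigma \leftarrow$ chooseConsistentSelected$_{f(x^1\sigma)-f_\sigma,\ldots,f(x^r\sigma)-f_\sigma}(x^1\sigma, \ldots, x^r\sigma, \alpha\sigma, \bar\alpha\sigma)$;
7 until $f(w\sigma) = k + f_\sigma$;
8 output $w\sigma$;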
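
For small block lengths, Algorithms 5 and 6 can be sketched together in Python, with chooseConsistentSelected implemented by brute-force enumeration over the block. This is a sketch under assumptions of ours: base-2 logarithms, blocks of length at most 2 use no samples at all (the formula for $r$ is not meaningful there), and all function names are introduced for illustration.

```python
import math
import random
from itertools import product


def sample_count(ell):
    """r from line 2 of Algorithm 6; 0 for blocks of length <= 2 (our convention)."""
    if ell <= 2:
        return 0
    log_l = math.log2(ell)
    return min(ell - 2, math.ceil((1 + 4 * math.log2(log_l) / log_l) * 2 * ell / log_l))


def flip_k_where_different(x, y, k, rng):
    """Copy y and flip min(k, H(x, y)) random positions in which x and y differ."""
    diff = [i for i in range(len(x)) if x[i] != y[i]]
    z = list(y)
    for i in rng.sample(diff, min(k, len(diff))):
        z[i] ^= 1
    return tuple(z)


def update(a, b, c):
    """Take b_i where a_i = c_i and keep a_i otherwise."""
    return tuple(bi if ai == ci else ai for ai, bi, ci in zip(a, b, c))


def with_block(base, block, bits):
    """Copy base and overwrite the positions listed in block with bits."""
    out = list(base)
    for i, bit in zip(block, bits):
        out[i] = bit
    return tuple(out)


def optimize_selected(f, y, z, rng):
    """Algorithm 6: optimize the block of positions in which y and z differ."""
    block = [i for i in range(len(y)) if y[i] != z[i]]
    ell = len(block)
    f_sigma = (f(y) + f(z) - ell) // 2  # fitness contributed outside the block
    r = sample_count(ell)
    candidates = list(product((0, 1), repeat=ell))
    while True:
        # randomWhereDifferent(y, z): random bits inside the block, y elsewhere.
        xs = [tuple(rng.randint(0, 1) for _ in block) for _ in range(r)]
        us = [f(with_block(y, block, x)) - f_sigma for x in xs]
        consistent = [c for c in candidates
                      if all(sum(ci == xi for ci, xi in zip(c, x)) == u
                             for x, u in zip(xs, us))]
        w = with_block(y, block, rng.choice(consistent or candidates))
        if f(w) == ell + f_sigma:
            return w


def optimize_kary(f, n, k, rng):
    """Algorithm 5: optimize blocks of length k one after another."""
    x = tuple(rng.randint(0, 1) for _ in range(n))
    y = tuple(1 - b for b in x)
    for _ in range(-(-n // k)):  # ceil(n / k) rounds
        z = flip_k_where_different(x, y, k, rng)
        w = optimize_selected(f, y, z, rng)
        x, y = update(x, w, z), w
    return x


target = (1, 1, 0, 1, 0, 0, 1, 0, 1, 1)
onemax = lambda x: sum(1 for a, b in zip(x, target) if a == b)
result = optimize_kary(onemax, len(target), 4, random.Random(3))
```

The call to `update(x, w, z)` copies the optimized block from $w$ into $x$ and leaves the not yet optimized positions of $x$ untouched, so that $x$ and $y$ agree exactly on the blocks processed so far.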



Proof of Theorem 9 for arity $k \ge 3$. To prove the correctness of the algorithm,
assume without loss of generality the input $f =$ OneMax for which
the correct output is $1^n$.

We first claim that a call to optimizeSelected$_{n,k}(\alpha\sigma, \bar\alpha\sigma)$ will terminate
after a finite number of iterations with output $1^k\sigma$ almost surely.
The variable $f_\sigma$ is assigned in line 2 of Algorithm 6, and it is easy to
see that it takes the value $f_\sigma = f(0^k\sigma)$. It follows from linearity of $f$
and from $f(\alpha 0^{n-k}) + f(\bar\alpha 0^{n-k}) = k$, that $f(w 0^{n-k}) = f(w\sigma) - f(0^k\sigma) =
f(w\sigma) - f_\sigma$. The termination condition $f(w\sigma) = k + f_\sigma$ is therefore equivalent
to the condition $w\sigma = 1^k\sigma$. For all $x \in \{0,1\}^k$, it holds that
$\mathrm{OM}_{1^k}(x) = f(x 0^{n-k})$, so $1^k$ is a member of the set $Z_{u,x}$. Hence, every invocation
of chooseConsistentSelected returns $1^k\sigma$ with non-zero probability.
Therefore, the algorithm terminates after every iteration with non-zero
probability, and the claim holds.

We then prove by induction the invariant property that for all t G [r -|- 1] , 
and i G [n], if x\ = y\ then x\ = y\ = 1- The invariant clearly holds for 

1 = 1, so assume that the invariant also holds for t = j < r. Without loss 
of generality, x^ = aPj, = a/37, x^~^^ = (3^, and y^~^^ = /3^. By 
the claim above and the induction hypothesis, both the common prefix $\alpha$
and the common suffix $\gamma$ consist of only 1-bits. So the invariant holds for
$t = j + 1$, and by induction also for all $t \in [\tau + 1]$.

It is easy to see that for all $t \le \tau$, the Hamming distance between
^t+i ^ ^t^^ g^^^ yt+i ^ ^t^^ H{x^+\y^+^) = H{a^j,aPj) - l{t) = 

$H(x^t, y^t) - \ell(t)$. By induction, it therefore holds that

$$H(x^{\tau+1}, y^{\tau+1}) = H(x^1, y^1) - \sum_{t=1}^{\tau} \ell(t) = n - \sum_{t=1}^{\tau} \min\{k, n - k(t-1)\} = 0.$$

Hence, by the invariant above, the algorithm returns the correct output $1^n$.
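The final sum can be checked numerically. The sketch below assumes $\tau = \lceil n/k \rceil$ blocks with lengths $\ell(t) = \min\{k, n - k(t-1)\}$, which is what the sum above suggests; the helper name `block_lengths` is hypothetical.

```python
def block_lengths(n, k):
    """Lengths l(1), ..., l(tau) of the blocks, assuming tau = ceil(n / k)."""
    tau = -(-n // k)  # integer ceiling of n / k
    return [min(k, n - k * (t - 1)) for t in range(1, tau + 1)]

# Example: n = 10, k = 3 gives three full blocks and one block of length 1.
lengths = block_lengths(10, 3)
```

Every block except possibly the last has length $k$, and the lengths add up to $n$, so the Hamming distance between the two strings reaches 0 after the last block.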

The runtime of the algorithm in each iteration is dominated by the subroutine
optimizeSelected. Note that by definition, the probability that
chooseConsistentSelected$_{f(x^1\sigma) - f_\sigma, \ldots, f(x^r\sigma) - f_\sigma}$($x^1\sigma, \ldots, x^r\sigma, a\sigma, \bar{a}\sigma$)
chooses $z\sigma$ in $\{0,1\}^n$ is the same as the probability that
chooseConsistent$_{f(x^1), \ldots, f(x^r)}$($x^1, \ldots, x^r$) chooses $z$ in $\{0,1\}^k$. To
finish the proof, we distinguish between two cases.

Case 1: $k \le 53$. In this case, it suffices¹ to prove that the runtime
is $O(n)$. For the case $k = 2$, this follows from Lemma 10. For
$2 < k \le 53$, it holds that $r = k - 2$. Each iteration in optimizeSelected
uses $r + 1 = k - 1 = O(1)$ function evaluations, and the probability
that chooseConsistentSelected optimizes a block of $k$ bits is at least
$1 - (1 - 2^{-k})^r = \Omega(1)$ (when $w = 1^k$). Thus, the expected optimization time
for a block is $O(1)$, and for the entire bitstring it is at most $(n/k) \cdot O(1)$.

Case 2: $k \ge 54$. In this case, $r = \lceil (1 + \frac{4 \log\log k}{\log k}) \frac{2k}{\log k} \rceil$ holds. Hence,
with an analysis analogous to that in the proof of Theorem 6, we can show
that the expected runtime of optimizeSelected is at most
$(1 + o_k(1)) 2(k-2)/\log(k-2)$. Thus, the expected runtime is at most
$n/k \cdot (1 + o_k(1)) 2(k-2)/\log(k-2) = (1 + o_k(1)) 2n/\log k$. □

¹ Assume that the expected runtime is less than $cn$ for some constant $c > 0$ when
$k \le 53$. It is necessary to show that $cn \le 2n/\log k + h(k) 2n/\log k$, for some function $h$,
where $\lim_{k \to \infty} h(k) = 0$. This can easily be shown by choosing any such function $h$, where
$h(k) \ge c \log k/2$ for $k \le 53$.
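The split into the two cases at $k = 53$ and $k = 54$ can be reproduced numerically. The sketch below is only an illustration under two assumptions: the number of samples in line 2 of Algorithm 6 reads $r = \min\{k - 2, \lceil (1 + 4\log\log k/\log k) \cdot 2k/\log k \rceil\}$, and all logarithms are taken to base 2. The function names are hypothetical.

```python
import math

def second_term(k):
    """Ceiling term of the minimum in line 2 of Algorithm 6 (assumed form, base-2 logs)."""
    log_k = math.log2(k)
    return math.ceil((1 + 4 * math.log2(log_k) / log_k) * 2 * k / log_k)

def samples_per_round(k):
    """Number r of samples drawn per iteration of optimizeSelected."""
    return min(k - 2, second_term(k))

# For 3 <= k <= 53 the minimum is attained by k - 2 (Case 1);
# from k = 54 on, the ceiling term is at most k - 2 (Case 2).
small_case = all(samples_per_round(k) == k - 2 for k in range(3, 54))
```

Under these assumptions the ceiling term first drops to $k - 2$ at $k = 54$, which agrees with the case distinction in the proof.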



5 The Complexity of LeadingOnes 

In this section, we show that allowing $k$-ary variation operators, for $k > 1$,
greatly reduces the black-box complexity of the LeadingOnes function
class, namely from $\Theta(n^2)$ down to $O(n \log n)$. We define the class
LeadingOnes as follows.

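Definition 13 (LeadingOnes). Let $n \in \mathbb{N}$. Let $\sigma \in S_n$ be a permutation of
the set $[n]$ and let $z \in \{0,1\}^n$. The function $\mathrm{LO}_{z,\sigma}$ is defined via
$\mathrm{LO}_{z,\sigma}(x) := \max\{i \in [0..n] \mid z_{\sigma(j)} = x_{\sigma(j)} \text{ for all } j \in [i]\}$.
We set LeadingOnes$_n$ := $\{\mathrm{LO}_{z,\sigma} \mid z \in \{0,1\}^n, \sigma \in S_n\}$.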
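To make the definition concrete, here is a small sketch of $\mathrm{LO}_{z,\sigma}$. It assumes 0-based positions, with `sigma` given as a list that lists the positions in the order $\sigma(1), \ldots, \sigma(n)$; the function name `leading_ones` is hypothetical.

```python
def leading_ones(z, sigma):
    """Return LO_{z,sigma}: the largest i such that x matches z on sigma(1..i)."""
    def f(x):
        count = 0
        for pos in sigma:
            if x[pos] != z[pos]:
                break
            count += 1
        return count
    return f

# Example with n = 4: positions are inspected in the order 2, 0, 3, 1.
f = leading_ones((1, 0, 1, 1), [2, 0, 3, 1])
value = f((0, 0, 1, 0))  # matches at position 2, mismatches at position 0
```

With $z = 1^n$ and $\sigma$ the identity, this reduces to the classical function that counts the leading ones of $x$.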

The class LeadingOnes is well-studied. Already in 2002, Droste, Jansen
and Wegener [DJW02] proved that the classical (1 + 1) EA has an expected
optimization time of $\Theta(n^2)$ on LeadingOnes. This bound seems to be
optimal among the commonly studied versions of evolutionary algorithms.
In [LW10], the authors prove that the unbiased unary black-box complexity
of LeadingOnes is $\Theta(n^2)$.

Droste, Jansen and Wegener [DJW06] consider a subclass of
LeadingOnes$_n$, namely LeadingOnes$^0_n$ := $\{\mathrm{LO}_{z,\mathrm{id}} \mid z \in \{0,1\}^n\}$, where
id denotes the identity mapping on $[n]$. Hence their function class is not
permutation invariant. In this restricted setting, they prove a black-box
complexity of $\Theta(n)$. Of course, their lower bound of $\Omega(n)$ is a lower bound
for the unrestricted black-box complexity of the general LeadingOnes$_n$
class, and consequently, a lower bound also for the unbiased black-box
complexities of this class.

The following theorem is the main result in this section. 

Theorem 14. The unbiased binary black-box complexity of LeadingOnes$_n$
is $O(n \log n)$.

The key ingredient of the two black-box algorithms that yield our upper 
bound is an emulation of a binary search which determines the (unique) bit 
that increases the fitness and flips this bit. Surprisingly, this can be
done already with a binary operator. This works in spite of the fact that 
we also follow the general approach of the previous section of keeping two 
individuals x and y such that for all bit positions in which x and y agree, 
the corresponding bit value equals the one of the optimal solution. 

We will use the two unbiased binary variation operators
randomWhereDifferent (as described in Section 4.2) and
switchIfDistanceOne. The operator switchIfDistanceOne($y$, $y'$) returns
$y'$ if $y$ and $y'$ differ in exactly one bit, and returns $y$ otherwise. It is
easy to see that switchIfDistanceOne is an unbiased variation operator.
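The two operators described above can be sketched as follows. Bitstrings are represented as tuples of 0/1 values and the random source is passed in explicitly; both choices, as well as the snake_case function names, are assumptions made for this illustration.

```python
import random

def random_where_different(y, y_prime, rng):
    """Copy the bits on which the parents agree; draw the others uniformly at random."""
    return tuple(a if a == b else rng.randint(0, 1) for a, b in zip(y, y_prime))

def switch_if_distance_one(y, y_prime):
    """Return y_prime if the parents differ in exactly one bit, and y otherwise."""
    distance = sum(a != b for a, b in zip(y, y_prime))
    return y_prime if distance == 1 else y

rng = random.Random(0)
child = random_where_different((0, 0, 1, 1), (0, 1, 1, 0), rng)
# child agrees with both parents on positions 0 and 2.
```

Neither operator refers to specific bit positions or bit values, only to where the two parents agree or differ, which is the intuition behind their unbiasedness.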

We call a pair $(x, y)$ of search points critical if the following two conditions
are satisfied: (i) $f(x) \ge f(y)$; (ii) there are exactly $f(y)$ bit-positions
$i \in [n]$ such that $x_i = y_i$. The following is a simple observation.

Lemma 15. Let $f \in$ LeadingOnes$_n$. If $(x, y)$ is a critical pair, then either
$f(x) = n = f(y)$ or $f(x) > f(y)$.

If $f(x) > f(y)$, then the unique bit-position $k$ such that flipping the $k$-th
bit in $x$ reduces its fitness to $f(y)$, or equivalently the unique bit-position
such that flipping this bit in $y$ increases $y$'s fitness, shall be called the
critical bit-position. We also call $f(y)$ the value of the pair $(x, y)$.

Note that the above definition uses only some function values of $f$,
but not the particular definition of $f$. If $f = \mathrm{LO}_{z,\sigma}$, then the above implies
that $x$ and $y$ are equal on the bit-positions $\sigma(1), \ldots, \sigma(f(y))$ and are different
on all other bit-positions. Also, the critical bit-position is $\sigma(f(y) + 1)$, and
the only way to improve the fitness of $y$ is flipping this particular bit-position
(and keeping the positions $\sigma(1), \ldots, \sigma(f(y))$ unchanged). The central part
of Algorithm 7, which is contained in lines 3 to 9, manages to transform a
critical pair of value $v < n$ into one of value $v + 1$ in expected $O(\log n)$ time.
This is analyzed in the following lemma.

Lemma 16. Assume that the execution of Algorithm 7 is before line 4 and
that the current value of $(x, y)$ is a critical pair of value $v < n$. Then after
an expected number of $O(\log n)$ iterations, the loop in lines 5-9 is left and
$(x, y)$ or $(y, x)$ is a critical pair of value $v + 1$.

Proof. Let $k$ be the critical bit-position of the pair $(x, y)$. Let $y' = x$ be a
copy of $x$. Let $J := \{i \in [n] \mid y_i \ne y'_i\}$. Our aim is to flip all bits of $y'$ with
index in $J \setminus \{k\}$.

We define $y''$ by flipping each bit of $y'$ with index in $J$ with probability
1/2. Equivalently, we can say that $y''_i$ equals $y_i$ for all $i$ such that
$y'_i = y_i$, and is random for all other $i$ (thus, we obtain such $y''$ by applying
randomWhereDifferent($y$, $y'$)).

With probability exactly 1/2, the critical bit was not flipped ("success"),
and consequently, $f(y'') > f(y)$. In this case (due to independence), each
other bit with index in $J$ has a chance of 1/2 of being flipped. So with
constant probability at least 1/2, $\{i \in [n] \mid y_i \ne y''_i\} \setminus \{k\}$ is at most half the
size of $J \setminus \{k\}$. In this success case, we take $y''$ as new value for $y'$.

In consequence, the cardinality of $J \setminus \{k\}$ never increases, and with
probability at least 1/4, it decreases by at least 50%. Consequently, after an
expected number of $O(\log n)$ iterations, we have $|J| = 1$, namely $J = \{k\}$.
We check this via an application of switchIfDistanceOne. □
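The halving argument in this proof can be illustrated by simulating only the size of $J \setminus \{k\}$. The sketch below is an abstraction of the inner loop, not the algorithm itself. It assumes that each iteration succeeds with probability 1/2 and that, on success, every remaining index stays in the set independently with probability 1/2; the function name is hypothetical.

```python
import random

def inner_loop_iterations(m, rng):
    """Iterations until J = {k}, where m is the initial size of J minus {k}."""
    iterations = 0
    while True:
        iterations += 1
        # "Success": the critical bit keeps its value from y'.
        if rng.random() < 0.5:
            # Each other differing bit is reset to the value of y with probability 1/2.
            m = sum(rng.random() < 0.5 for _ in range(m))
        # switchIfDistanceOne fires as soon as only the critical bit differs.
        if m == 0:
            return iterations

rng = random.Random(0)
runs = [inner_loop_iterations(1023, rng) for _ in range(200)]
mean_iterations = sum(runs) / len(runs)
```

For $m = 1023$ the empirical mean stays within a small constant factor of $\log_2 m \approx 10$, in line with the $O(\log n)$ bound of the lemma.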
We are now ready to prove the main result of this section.

Proof of Theorem 14. We regard the following invariant: $(x, y)$ or $(y, x)$ is
a critical pair. This is clearly satisfied after execution of line 1. From
Lemma 16, we see that a single execution of the outer loop does not violate
our invariant. Hence by Lemma 15, our algorithm is correct (provided it
terminates). The algorithm does indeed terminate, namely in expected
$O(n \log n)$ time, because, again by Lemma 16, each iteration of the outer loop
takes an expected $O(\log n)$ time and increases the value of the critical pair
by one. □



Algorithm 7: Optimizing LeadingOnes with unbiased binary variation
operators.

1 initialization $x \leftarrow$ uniformSample(); $y \leftarrow$ complement($x$);
2 repeat
3   if $f(y) > f(x)$ then $(x, y) \leftarrow (y, x)$;
4   $y' \leftarrow x$;
5   repeat
6     $y'' \leftarrow$ randomWhereDifferent($y$, $y'$);
7     if $f(y'') > f(y)$ then $y' \leftarrow y''$;
8     $y \leftarrow$ switchIfDistanceOne($y$, $y'$);
9   until $f(y) = f(y')$;
10 until $f(x) = f(y)$;
11 output $x$;
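Algorithm 7 can be run on a concrete LeadingOnes instance with the following sketch. It assumes 0-based positions and tuples as bitstrings, and it caches fitness values instead of re-querying them; the function names and the query counter are hypothetical additions for illustration.

```python
import random

def leading_ones(z, sigma):
    """LO_{z,sigma}: number of positions, in the order sigma, on which x matches z."""
    def f(x):
        count = 0
        for pos in sigma:
            if x[pos] != z[pos]:
                break
            count += 1
        return count
    return f

def random_where_different(y, y_prime, rng):
    return tuple(a if a == b else rng.randint(0, 1) for a, b in zip(y, y_prime))

def switch_if_distance_one(y, y_prime):
    return y_prime if sum(a != b for a, b in zip(y, y_prime)) == 1 else y

def optimize_leading_ones(f, n, rng):
    """Follow Algorithm 7 line by line; returns (x, number of fitness queries)."""
    queries = 0

    def query(s):
        nonlocal queries
        queries += 1
        return f(s)

    x = tuple(rng.randint(0, 1) for _ in range(n))            # line 1
    y = tuple(1 - bit for bit in x)
    fx, fy = query(x), query(y)
    while True:                                               # line 2
        if fy > fx:                                           # line 3
            x, y, fx, fy = y, x, fy, fx
        y_prime, f_prime = x, fx                              # line 4
        while True:                                           # line 5
            y_second = random_where_different(y, y_prime, rng)  # line 6
            f_second = query(y_second)
            if f_second > fy:                                 # line 7
                y_prime, f_prime = y_second, f_second
            switched = switch_if_distance_one(y, y_prime)     # line 8
            if switched != y:
                y, fy = switched, f_prime                     # y now equals y'
            if fy == f_prime:                                 # line 9
                break
        if fx == fy:                                          # line 10
            break
    return x, queries                                         # line 11
```

A call such as `optimize_leading_ones(leading_ones(z, sigma), n, random.Random(1))` returns the hidden string `z` together with the number of fitness evaluations used.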



6 Conclusion and Future Work 

We continue the study of the unbiased black-box model introduced
in [LW10]. For the first time, we analyze variation operators with arity
higher than one. Our results show that already binary operators can allow
significantly faster algorithms.

The problem OneMax cannot be solved in shorter time than $\Omega(n \log n)$
with unary variation operators [LW10]. However, the runtime can be
reduced to $O(n)$ with binary operators. The runtime can be decreased
even further with arities higher than two. For $k$-ary variation operators,
$2 \le k \le n$, the runtime can be reduced to $O(n/\log k)$, which for $k = n^{\Theta(1)}$
matches the lower bound in the classical black-box model. A similar positive
effect of higher arity variation operators can be observed for the function
class LeadingOnes. While this function class cannot be optimized faster
than $\Omega(n^2)$ with unary variation operators [LW10], we show that the
runtime can be reduced to $O(n \log n)$ with binary or higher arity variation
operators.

Despite the restrictions imposed by the unbiasedness conditions, our
analysis demonstrates that black-box algorithms can employ new and more
efficient search heuristics with higher arity variation operators. In particular,
binary variation operators allow a memory mechanism that can be used to
implement binary search on the positions in the bitstring. The algorithm
can thereby focus on parts of the bitstring that have not previously been
investigated.

An important open problem arising from this work is to provide lower 
bounds in the unbiased black-box model for higher arities than one. Due 
to the greatly enlarged computational power of black-box algorithms using 
higher arity operators (as seen in this paper), proving lower bounds in this 
model seems significantly harder than in the unary model. 